The role amenities play in spatial sorting of migrants and their impact on welfare: Evidence from China

From 2005 to 2015, China’s high-skilled labor was increasingly concentrated in cities with high wages and high rents, while a narrowing of the wage gap between high- and low-skilled labor showed an opposite trend to an increase in geographic sorting. In this research, I estimated a spatial equilibrium structural model to identify the causes of this phenomenon and its impact on welfare. Changes in local labor demand essentially led to an increase in skill sorting, and changes in urban amenities further contributed to this trend. An agglomeration of high-skilled labor raised local productivity, increased wages for all workers, reduced the real wage gap, and widened the welfare gap between workers with different skills. In contrast to the welfare effects of changes in the wage gap driven by exogenous productivity changes, changes in urban wages, rents, and amenities increased welfare inequality between high- and low-skilled workers, but this is mainly because the utility of low-skilled workers from urban amenities is constrained by migration costs; if migration costs caused by China’s household registration policy were eliminated, changes in urban wages, rents, and amenities would reduce welfare inequality between high- and low-skilled workers to a greater extent than a reduction in the real wage gap between these two groups.


Introduction
With the wave of college enrollment expansion in China since 1999, the education level of domestic workers has increased. At the same time, a large number of workers have left their household registration areas to work and live elsewhere. From 2005 to 2015, the proportion of migrants nearly tripled, and the proportion of college-educated workers among migrants has grown even faster than the proportion of college-educated workers nationwide, with increasingly more migrants (especially those with a higher education) choosing to live in large cities.
Wages and rents have risen in larger cities relative to smaller ones, while the wage gap between high- and low-skilled labor has narrowed over time. These facts raise some questions: What factors contribute to the growing trend of spatial sorting of the workforce? What are the agglomeration and dispersion forces, respectively? Does a narrowing of the wage gap between high- and low-skilled workers reflect similar changes in the welfare gap between these two groups? Once workers choose to live in a city with high housing costs, the local price level may offset some of the consumption utility derived from high wages, resulting in reduced welfare for workers; alternatively, cities with high local prices may provide desirable urban amenities for workers as compensation for high rents, thereby increasing the workforce's welfare. The impact of growing trends in spatial sorting on welfare depends on key factors that drive high- and low-skilled workers to make different choices regarding the cities in which they live.
This paper focuses on the determinants of the choice of different cities by high- and low-skilled workers under the sorting trend and the welfare impacts of these choices. By estimating a spatial equilibrium structural model of local labor demand, housing supply, labor supply, and amenities supply, this paper illustrates that changes in the relative demand for high- and low-skilled labor caused by changes in local productivity are the drivers that underpin the differences in high- and low-skilled labor's migration patterns.
While local wage changes can be (and often are) the initial cause of migration, I discovered that cities that attract a disproportionate amount of high-skilled labor will endogenously become more desirable places to live and more productive for all workers living in them. A combination of desirable wages and amenities makes high-skilled workers willing to pay high housing costs to live in these cities. While low-skilled workers also find good wages and amenities desirable, they are unwilling to pay such high living costs, and they have more difficulty accessing adequate urban amenities. Consequently, after weighing the pros and cons, they may choose a more desirable city.
Overall, this paper finds that, as migration costs limit migrants' (especially low-skilled workers') access to local amenities, the welfare effects of changes in local wages, rents, and endogenous amenities lead to increased welfare inequality between high- and low-skilled workers. When migration costs are eliminated and workers get full access to urban resources, the welfare effects of changes in local wages, rents, and endogenous amenities reduce welfare inequality, and the reduction in welfare inequality is greater than the reduction in the real wage gap between high- and low-skilled workers. This paper builds on Diamond's (2016) [1] urban spatial equilibrium structural model by adding settings related to migration costs across cities and characterizing utility losses derived from migrants' limited use of urban amenities due to a lack of local household registration as well as competition with other residents for urban resources. The model adds heterogeneous labor preferences for cities based on the frameworks of Rosen (1979) [2] and Roback (1982) [3]. I used a static discrete choice setup to simulate the labor force's city choices. This model allows workers with different demographics to weigh the relative value of urban features in different ways, which leads them to make different location decisions.
In this paper, workers with a college education were defined as high-skilled labor, and workers with a high school education or less were defined as low-skilled labor. There are differences in the local productivity levels of high- and low-skilled labor, and the productivity levels of high- and low-skilled labor are influenced by the skill mix of the city. Thus, changes in the urban skill mix affect local wages by changing firms' labor supply and demand, and by directly affecting labor productivity. Firms in each city use capital and workers as inputs for production. Housing markets differ across cities due to the heterogeneity of housing supply elasticities.
In addition to treating wages and housing costs as endogenous factors, I allowed amenity supply to respond to the skill mix of the city. To measure urban amenity levels as comprehensively as possible, I collected data on seventeen different amenities in seven categories. I used an autoencoder (AE) to combine these seventeen data sources into a single amenity index.
A two-step estimation method was used to estimate workers' preferences for cities, similar to the setup proposed by McFadden (1973) [4] and the method used by Berry et al. (2004) [5].
First, a conditional logit method was used to determine the average desirability of each city for each type of worker every five years. Then a nonlinear generalized method of moments (GMM) was used to estimate the model, and in this step the estimated utility levels of workers living in each city were used as a dependent variable to estimate how workers trade off wages, rents, and amenities when choosing where to live.
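The first step can be illustrated with the standard conditional logit choice probability, in which the share of workers choosing city j is exp(δ_j) divided by the sum of exp(δ_k) over all cities. The following is a minimal sketch only: the city names and mean utility values are hypothetical, and the function is not the paper's estimation code.

```python
import math


def choice_probabilities(mean_utilities):
    """Conditional logit shares: P_j = exp(delta_j) / sum_k exp(delta_k)."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(mean_utilities.values())
    weights = {city: math.exp(delta - shift) for city, delta in mean_utilities.items()}
    total = sum(weights.values())
    return {city: weight / total for city, weight in weights.items()}


# Hypothetical mean utilities (delta) of three cities for one worker type in one year.
delta = {"city_a": 1.0, "city_b": 0.0, "city_c": -0.5}
shares = choice_probabilities(delta)
```

In estimation the direction is reversed: observed location shares are used to recover the mean utilities, which then serve as the dependent variable in the second (GMM) step.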
Endogeneity was addressed using local labor demand shocks driven by the industrial structure of each city and its interaction term with local housing supply elasticities as instrumental variables. According to the industrial composition of urban employment, differences in productivity changes across industries will have different effects on the demand for high- and low-skilled workers in cities (Bartik, 1991) [6]. Exogenous local productivity changes were measured by interacting differences in the composition of employment across industries with changes in industry average wages for high- and low-skilled labor, respectively. Following the literature (Saiz, 2010; Gyourko et al., 2008) [7,8], I set the elasticity of the urban housing supply to vary according to the geographic constraints of developable land around urban centers and land use regulations. The elasticity of a city's housing supply impacts equilibrium wages, rents, and population.
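The shift-share logic of the labor demand shock can be sketched as follows: a city's shock is the sum, over industries, of its base-year employment share in each industry multiplied by the industry-level wage change for the relevant skill group. The industry names and numbers below are hypothetical, and refinements that are common in this literature (for example, excluding the city's own contribution from the industry-level change) are omitted.

```python
def bartik_shock(base_shares, industry_wage_growth):
    """Shift-share shock: sum over industries of base-year share times industry-level change."""
    return sum(share * industry_wage_growth[ind] for ind, share in base_shares.items())


# Hypothetical base-year employment shares of one city (they sum to one).
base_shares = {"manufacturing": 0.5, "services": 0.3, "construction": 0.2}

# Hypothetical industry-level changes in the log average wage of high-skilled labor.
growth_high = {"manufacturing": 0.10, "services": 0.30, "construction": 0.05}

shock_high = bartik_shock(base_shares, growth_high)
```

The same function applied to the low-skilled wage changes would give the second shock; two cities facing identical industry-level changes still receive different shocks because their base-year industry mixes differ.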
Taken together, the literature focusing on the role of urban amenities in the process of labor demand shocks affecting labor migration generally concludes that the level of urban amenities is a positive driving force for labor agglomeration. On this basis, the findings of this paper further suggest that endogenous local amenity changes are an important mechanism that drives labor migration in response to local labor demand shocks.
A growing body of literature has studied how amenities vary with the composition of an area's residents (Bayer et al., 2007; Brueckner and Rosenthal, 2009; McKinnish et al., 2010; Guerrieri et al., 2013; Handbury, 2021) [15][16][17][18][19]. Handbury (2021) [19] provided direct evidence that the products and prices offered in the local market are related to the tastes of different income groups. A large body of urban economics literature argues that these tastes help explain observed spatial disparities in income and skills across cities (Glaeser et al., 2001; Couture and Handbury, 2020) [20,21]: High-skilled and high-income workers tend to make similar decisions about location because they enjoy more utility from locally endogenous amenities than do low-skilled, low-income workers. In this paper, this premise is one of the main forces driving the spatial sorting trend of high- and low-skilled labor. With the help of a spatial equilibrium model, this paper provides empirical support for the theory, in that changes in skill-biased amenities are obtained by reconciling changes in rents and wages with observed changes in the skill composition of a city. A similar procedure is found in Black et al. (2009) [22].
The findings of this paper are also relevant to the literature that studies changes in wage structures and inequality within and between local labor markets [14,23-27]. Of the above literature, the most relevant to this paper is Moretti (2013) [23], who was the first to illustrate the importance of considering the different location choices of high- and low-skilled labor when measuring changes in real wage and welfare inequality.
Another thread of the literature, specifically related to the labor demand estimates in this paper, studies the impact of the relative supply of high- and low-skilled labor on their wages (Card, 2009; Dustmann et al., 2013; Lewis, 2011; Dustmann and Glitz, 2015; Llull, 2018; Foged and Peri, 2016) [28][29][30][31][32][33]. A strand of literature represented by Card (2009) [28] focused on wage and welfare inequality between high- and low-skilled labor. This paper follows the identification strategy proposed by Diamond (2016) [1], which differs from the traditional hedonic method of estimating labor demand at the city level and takes into account endogenous productivity changes.
The labor supply model and estimation take advantage of the discrete choice method developed in the empirical literature on industrial organization (McFadden, 1973; Nevo, 2001; Fan, 2013; Busso et al., 2013; Berry and Haile, 2014) [4,34-37]. This method has also been used in much of the regional economics literature (Bayer et al., 2007; Bayer et al., 2009; Kennan and Walker, 2011) [15,38,39]. However, the models of Bayer et al. (2009) [38] and Kennan and Walker (2011) [39] do not allow local wages and rents to be related to local amenities. In this paper, I used this method to estimate the determinants of urban labor supply. This paper is organized as follows. Section 2 discusses the data and variables, Section 3 presents the stylized facts, Section 4 builds the model, Section 5 discusses the model's estimation techniques, Section 6 presents the parameter estimation, Section 7 discusses the estimation of urban amenities and productivity, Section 8 analyzes the impact of registered population on location choice, Section 9 analyzes the determinants of changes in the urban high-skilled employment ratio, Section 10 presents potential implications for welfare, and Section 11 concludes.

Data sources
This research used 2010 census data as well as 2005 and 2015 mini-census data to calculate the migration flow of labor with different skill levels, housing rents, and other related items across cities. The 2000 census data were also used in the stylized facts to accurately capture the trends of some indicators. I also used the China Urban Statistical Yearbook, the China Urban Construction Statistical Yearbook, the China Regional Economic Statistical Yearbook, the China County (City) Social and Economic Statistical Yearbook, the China County Statistical Yearbook, and the statistical yearbook of each city and province to obtain comprehensive city-level data. Also, data from the China Migrants Dynamic Survey (CMDS) were used for some indicators.
The target population of this paper was migrants in China. According to the CMDS definition of migrants, migrants are those who have lived in an inflow area for more than one month, whose household registration is not registered in the local district, and are over 15 years old. In this paper, samples were selected from the dataset according to this definition.
The scope of a city in the manuscript is an entire city at the prefecture level. There are two reasons for not using municipal districts (shixiaqu in Chinese) as the scope criteria. On the one hand, the official census microdata provided by the National Bureau of Statistics is desensitized, and it provides 4-digit address codes that are accurate down to the prefecture level of a city. If a further subdivision is desired, the dataset provides urban and rural classification codes, which contain three categories, namely "cheng", "zhen", and "xiang," corresponding to urban areas, towns, and villages, respectively. However, the scope of "cheng" or the scope of "cheng" and "zhen" is not the same as the scope of a municipal district (shixiaqu) defined in the statistical yearbook. Therefore, it is inappropriate to simply keep the data of "cheng" or to keep the data of "cheng" and "zhen" and use them as the data of municipal districts. On the other hand, if the difference in scope is ignored, directly combining "cheng" and "zhen" and treating the combined data as data of the municipal district, excluding the data of "xiang", a large number of samples would be discarded (53% in 2005, 51% in 2010 and 42% in 2015), which would lead to a shortage of the number of high-skilled workers in many cities, reducing the number of cities analyzed by nearly half and significantly affecting the accuracy of the empirical results.
When calculating the local good expenditure share, there are different choices regarding the scope of local goods, with some choosing to consider only housing as a local good (Davis and Ortalo-Magné, 2011; Wang and Li, 2015) [40,41]. Others choose to consider both housing and nonhousing commodities as local goods (Lewbel and Pendakur, 2009; Moretti, 2013) [23,42,43]. These choices are discussed in the Parameter Estimation section. The required data came from the census and the CMDS. The 2010 CMDS survey covers only 106 cities. Therefore, for parameters that would otherwise be calculated from the 2010 CMDS data, I used the 2011 CMDS data deflated with an appropriate index, such as the wage index.

Imputed city-skill level wages by the weighted method
For the empirical part of this study, I needed the average wages of the labor force with different skills in different cities in 2005, 2010, and 2015. However, wages are not counted in any census other than the 2005 census. Therefore, I could not directly calculate the average wage of the labor force with different skills in each city based on census data. To solve this problem, I used the weighted method of Fang and Huang (2022) [44] to calculate the average wages of different skills across cities by weighting the wages of different industries provided by the statistical yearbook with the number of high- and low-skilled workers in each industry provided by census data. In each city's statistical yearbook, the average wages of workers employed in different industries are counted; in the census data, information about the education and industry of the labor force is provided. Therefore, I could first obtain an individual worker's wage based on the average wage at the industry-city level where the worker was employed, and then calculate the city-level average wage by skill according to Eqs (1) and (2):

$$w^{H}_{jt} = \frac{\sum_{ind} w_{ind,jt}\, H_{ind,jt}}{H_{jt}} \qquad (1)$$

$$w^{L}_{jt} = \frac{\sum_{ind} w_{ind,jt}\, L_{ind,jt}}{L_{jt}} \qquad (2)$$

where $w_{ind,jt}$ is the average wage of workers engaged in industry ind in city j in year t, $H_{ind,jt}$ is the number of high-skilled workers working in industry ind in city j in year t, $H_{jt}$ is the number of high-skilled workers working in city j in year t, and $w^{H}_{jt}$ is the average wage of high-skilled workers in city j in year t. Eq (2) focuses on low-skilled workers, and the specific meaning of each item is similar.
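The weighting described above can be sketched in a few lines: each industry's average wage is weighted by the number of workers of a given skill type employed in that industry, and the sum is divided by the total number of workers of that type in the city. The industry names, wages, and head counts are hypothetical.

```python
def skill_average_wage(industry_wages, skill_counts):
    """Employment-weighted average of industry wages for one skill group in one city-year."""
    total_workers = sum(skill_counts.values())
    if total_workers == 0:
        raise ValueError("no workers of this skill type in the city sample")
    weighted_sum = sum(industry_wages[ind] * n for ind, n in skill_counts.items())
    return weighted_sum / total_workers


# Hypothetical industry average wages in one city-year (from a statistical yearbook).
industry_wages = {"manufacturing": 40000.0, "finance": 90000.0, "retail": 30000.0}

# Hypothetical census head counts by industry and skill type in the same city-year.
high_counts = {"manufacturing": 20, "finance": 60, "retail": 20}
low_counts = {"manufacturing": 60, "finance": 5, "retail": 35}

w_high = skill_average_wage(industry_wages, high_counts)
w_low = skill_average_wage(industry_wages, low_counts)
```

Because every worker in an industry is assigned the same industry average wage, the resulting gap between the two groups reflects only differences in industry mix, not within-industry pay differences.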
Although wage information is provided in the 2005 census data, in some cities, there is a gap between the wages of high-skilled/low-skilled labor obtained directly from the census data and the wages of high-skilled/low-skilled labor obtained using the weighted method. I used the average wages of employed workers provided by the China Urban Statistical Yearbook as the standard and used the wages calculated by the weighted method after a comprehensive comparison.
However, this method has shortcomings. Since census data are obtained by systematic sampling from raw data, when the sample is subdivided by city j and industry ind to count the number of workers with different skills, the wage calculations in some cities were abnormal due to limited samples or biased sampling; for example, the average wage of high-skilled workers is lower than the average wage of low-skilled workers, the average wage of high-skilled workers is lower than the average wage of all workers in a city, the average wage of low-skilled workers is higher than the average wage of all workers in a city, or the average wage of high/low-skilled workers in the early years is higher than the average wage of high/low-skilled workers in the later years. Also, data on the average wages of subindustries in some cities are missing, so the average wage of high/low-skilled workers cannot be calculated. I regarded all of these cases as abnormal calculation results. In total, the percentage of anomalous calculations of workers' average wage by skill is 12%, 15%, and 6% in 2005, 2010, and 2015, respectively. I used a Variational Autoencoder (VAE) to recover these outliers.

Recovering city-skill level wages using a VAE
A VAE is a deep generative model that was first proposed by Kingma and Welling (2013) [45]. It is a generative network structure based on a Gaussian mixture model that uses variational Bayesian inference (Goodfellow et al., 2016) [46]. In the fields of economics and finance, due to its powerful data generation capability, VAE is widely used for data synthesis (Koenecke and Varian, 2020) [47], time series forecasting (Jin et al., 2022) [48], big data processing (Sarduie et al., 2020) [49], risk management and control (Arian et al., 2020) [50], stock index tracking [51], education quality improvement [52], etc.
Unlike a traditional autoencoder (AE) that describes a latent space by points, a VAE describes the observation of a latent space in the form of a probability distribution. Regularized encoding distribution ensures that it has good characteristics in the latent space, making data generation possible (Blei et al., 2017) [53]. Data processing using a VAE can be divided into four steps: first, the input is encoded as a distribution over a latent space; second, a point in the latent space is sampled from this distribution; third, the sampled point is decoded, and the reconstruction error is calculated; and finally, the reconstruction error is back-propagated through the network (Rezende et al., 2014) [54].
In practice, the process of generating data can be summarized as follows:
1. Input a data point $x_i$ to the encoder and obtain, through the neural network, the parameters of the approximate posterior distribution $q_\phi(z|x_i)$ of the latent variable $z$. The posterior is generally assumed to be Gaussian, so the encoder outputs the parameters $\mu_i$ and $\sigma_i$ of that Gaussian (in practice, it outputs the log-variance $\log \sigma_i^2$).
2. With parameters $\mu_i$ and $\sigma_i$, draw a random variable $\varepsilon_i \sim N(0,1)$ and use it to sample a $z_i$ from the corresponding Gaussian distribution; $z_i$ represents a class of samples similar to $x_i$.
3. Input $z_i$ to the decoder, use the decoder to fit the likelihood distribution $p_\theta(x|z_i)$, and let the decoder output the parameters $\mu_i'$ and $\sigma_i'$ of the Gaussian distribution assumed for $p_\theta(x|z_i)$.
4. After obtaining the parameters of the $p_\theta(x|z_i)$ distribution, draw a sample from this distribution to generate a possible data point $\hat{x}_i$.
Fig 1 shows the basic structure of a VAE.
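The four steps can be traced in a minimal one-dimensional sketch. The encoder and decoder below are toy stand-ins with hypothetical fixed coefficients, not the trained networks used in this paper, and back-propagation is omitted; the sketch only shows how a latent draw is formed from the encoder output and how the two loss terms of a Gaussian VAE are computed.

```python
import math
import random


def encoder(x):
    """Toy stand-in for the encoder: returns (mu, log_var) of the approximate posterior."""
    return 0.8 * x, -1.0  # hypothetical coefficients, for illustration only


def decoder(z):
    """Toy stand-in for the decoder: returns (mu', log_var') of the Gaussian likelihood."""
    return 1.1 * z, -2.0  # hypothetical coefficients, for illustration only


def vae_forward(x, rng):
    mu, log_var = encoder(x)                                  # step 1: posterior parameters
    eps = rng.gauss(0.0, 1.0)                                 # step 2: eps ~ N(0, 1)
    z = mu + math.exp(0.5 * log_var) * eps                    #         z = mu + sigma * eps
    mu_x, log_var_x = decoder(z)                              # step 3: likelihood parameters
    x_hat = mu_x + math.exp(0.5 * log_var_x) * rng.gauss(0.0, 1.0)  # step 4: generated point
    # Reconstruction term: Gaussian negative log-likelihood of x under the decoder output.
    recon = 0.5 * (math.log(2.0 * math.pi) + log_var_x
                   + (x - mu_x) ** 2 / math.exp(log_var_x))
    # Regularization term: KL divergence between N(mu, sigma^2) and the N(0, 1) prior.
    kl = 0.5 * (math.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return x_hat, recon, kl


rng = random.Random(0)
x_hat, recon, kl = vae_forward(0.5, rng)
```

Writing the latent draw as mu + sigma * eps keeps the randomness in eps, so that in a trained model gradients can flow through mu and sigma.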
According to the central limit theorem, the average wage distribution of type-z workers in year t across all cities approximately obeyed a Gaussian distribution, where $z \in \{H, L\}$; the employed workers' average wage distribution in year t across all cities also approximately obeyed a Gaussian distribution. After normalizing the two, they both approximately obeyed the standard Gaussian distribution. Then, I could make an a priori hypothesis. I assumed that, for city j in year t, the quantile of the average wage of z-type workers in the standardized cross-city distribution of z-type average wages is the same as the quantile of the average wage of employed workers in city j in the standardized cross-city distribution of the average wages of employed workers.
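The a priori hypothesis can be made concrete with a small sketch. If both standardized distributions are treated as standard Gaussian, equal quantiles imply equal standardized scores, so a city's position in the overall wage distribution pins down a value in the skill-specific distribution. This illustrates the hypothesis only, not the VAE recovery procedure itself; the function name and all wage figures are hypothetical.

```python
from statistics import mean, pstdev


def wage_at_matched_quantile(city_overall_wage, overall_wages, skill_wages):
    """Map a city's standardized overall-wage score into the skill-specific distribution."""
    # Standardized score of the city in the cross-city overall wage distribution.
    score = (city_overall_wage - mean(overall_wages)) / pstdev(overall_wages)
    # Same score (hence same Gaussian quantile) in the skill-specific distribution.
    return mean(skill_wages) + score * pstdev(skill_wages)


# Hypothetical cross-city average wages of all employed workers.
overall = [30000.0, 40000.0, 50000.0]
# Hypothetical cross-city average wages of high-skilled workers (non-anomalous cities).
high = [45000.0, 60000.0, 75000.0]

at_median = wage_at_matched_quantile(40000.0, overall, high)
at_top = wage_at_matched_quantile(50000.0, overall, high)
```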
In this paper, both the input layer and the output layer are 1-dimensional, and I included nine hidden layers, with 60, 120, 72, 24, 1, 24, 72, 120, and 60 nodes, respectively. I used the LeakyReLU function as the activation function. The random seeds were set to control the randomness of the results (Nado et al., 2021;Chung et al., 2021) [55,56], and the learning rate was set to 0.001. According to Kingma and Ba (2014) [57], I used AdamOptimizer as the optimizer.

Main variables
I selected data on seventeen urban amenities that endogenously respond to the urban high-skill employment ratio. Wages, rents, land use regulations, land unavailability, and amenities are the main variables used in this paper. Table 1 reports the summary statistics for the main relevant variables.

Stylized facts
Based on available data, I measured changes in urban skill composition, migration trend, and sorting trend, from which I drew spatial sorting characteristics. I then measured changes in inequality in nominal wages, rents, and real wages between high- and low-skilled workers. From these observations, I documented four stylized facts and drew inferences.

Fact 1: Increasing share of migrants and high-skilled labor
From here forward, I refer to the decennial census and the mini-census simply as "the census". Using census data from 2000 to 2015, I calculated the share of urban migrants and the share of high-skilled labor among migrants and residents, respectively, for each of the four survey years. Table 2 shows that, from 2000 to 2015, the share of migrants in each city gradually increased, and the growth rate of each five-year period also increased, which means that the share of migrants in each city grew increasingly faster. Simultaneously, the share of high-skilled labor among migrants and residents both increased, with the share of high-skilled labor among migrants growing faster. Its average annual growth rate was 1.65 times higher than the average annual growth rate of residents' high-skill share. It is obvious that the migration trend gradually increased from 2000 to 2015 across cities. At the same time, the proportion of workers with a bachelor's degree or above across the country has increased. Also, the proportion of high-skilled workers who choose to live in cities other than their household registration cities is increasing.

Notes to Table 1: Employment rate refers to the share of employed workers aged fifteen to sixty-five years in industries other than agriculture in each city. The land use regulation index measures the intensity of policies and regulations that restrict land use for housing development in each city. Following the spirit of Tao (2011) [58] and Fan and Mo (2013) [59], I used the ratio of the average sales price of commercial and residential land to the average sales price of industrial land to measure the intensity of land use regulation. The land unavailability index measures the share of land that is unsuitable for housing development due to wetlands, lakes, rivers, and other internal water bodies and slopes exceeding 25% within 30 km of each urban center. The data required for this indicator were calculated using the 30-

Fact 2: The sorting trend is increasing, with larger cities having higher wages, higher rents, and a smaller wage gap between high- and low-skilled labor
(1) Sorting. Fig 2 shows how the sorting trend has changed over time. The semi-elasticity in 2015 was 0.0224, indicating that for every 1% increase in city size in that year, the share of high-skilled labor in migrants would increase by 2.24%. From 2000 to 2015, the semi-elasticity of the urban high-skilled labor share with respect to city size nearly tripled, indicating a growing trend of labor sorting across cities.
(2) Wage premium. Fig 3 shows the urban wage premium, which measures the increase in nominal average wages as city size increases. From 2000 to 2015, the elasticity of the average labor wage with respect to city size remained positive, and the average urban wage premium was 0.123, indicating that the larger a city, the higher the nominal wage; for every 1% increase in city size, the average labor wage increased by about 0.12%.
(3) Skill premium. Fig 4 shows the urban skill premium, which measures the degree to which the ratio of wages for high-skilled workers compared with low-skilled workers increases as city size increases. Since there were some extreme values in the raw data, I trimmed the data by truncating the extreme values at the top and bottom 0.05% quantiles of the income distribution, respectively. That is to say, nearly 150 extreme values were removed, representing about 1‰ of the total data. The skill premium elasticities were negative for all survey years, indicating that large cities do not imply a greater income gap between high- and low-skilled labor. The skill premium elasticity for 2015 was -0.00168, indicating that for every 1% increase in city size that year, the ratio of wages for high-skilled workers compared with low-skilled workers decreased by about 0.17%.
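As a reading aid for the wage premium figure, an elasticity with respect to city size can be obtained as the slope of a regression of log wage on log city size. The sketch below assumes that log-log form, which may differ from the exact specification behind the figures, and uses hypothetical data generated with a constant elasticity of 0.12.

```python
import math


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical city sizes, with wages generated from a constant elasticity of 0.12.
sizes = [1e5, 5e5, 1e6, 5e6, 1e7]
wages = [3000.0 * size ** 0.12 for size in sizes]

elasticity = ols_slope([math.log(s) for s in sizes], [math.log(w) for w in wages])

# Implied percentage change in wages when city size rises by 1%.
pct_change = (1.01 ** elasticity - 1.0) * 100.0
```

With an elasticity of 0.12, a 1% larger city is associated with wages that are roughly 0.12% higher, which is the sense in which the wage premium estimate is read.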

Fact 3: The share of high-skilled labor has increased more in small cities, and the wages have increased more in high-skill cities
Moretti (2004), Berry and Glaeser (2005), Shapiro (2006), and Moretti (2012) [60][61][62][63] noted that U.S. cities with higher college employment ratios in the base year also experienced larger increases in college employment ratios, a polarization these studies referred to as the "Great Divergence." For comparison, Fig 6A shows that in China, the change in the high-skill employment ratio in the decade after the base year (2005) was negatively correlated with the high-skill employment ratio in the base year. This implies that the ratio of high-skilled employment in small cities had increased more over the decade, which means there was no "Great Divergence" in China.
Differences in skill mix across cities are strongly correlated with wages and living costs. Fig 6B shows that there is a weak negative correlation between changes in average rents and changes in local high-skill employment ratios over the ten years. Fig 6C and 6D show that there is a positive correlation between the change in high (low) skill wages and the change in high-skill employment ratios with a coefficient of 0.045 (0.039), which means that for every 1% increase in the high-skill employment ratio change, high (low) skill wages would increase by 0.045 (0.039) percentage points. From the stylized facts above, one might ask: Why do wages increase more for all workers in high-skill cities? What is the mechanism? The empirical part of this paper uses structural and reduced-form equations to provide insight into the relationship between high- and low-skilled labor and within high-skilled/low-skilled labor to answer these questions.

Fact 4: The nominal wage gap between high- and low-skilled migrants has narrowed, the rent gap has widened, and the real wage gap has narrowed
In this paper, I used the approach of Mincer (1974) [64] to measure changes in inequality of wages, rents, and real wages caused by the spatial sorting of workers across cities via the elasticities of wages, rents, and real wages with respect to a worker's duration of education. I used the difference between the elasticity of wages for high-skilled workers and the elasticity of wages for low-skilled workers to represent the wage gap, the difference between the elasticity of rents for high-skilled workers and the elasticity of rents for low-skilled workers to represent the rent gap, and the difference between the elasticity of real wages for high-skilled workers and the elasticity of real wages for low-skilled workers to represent the real wage gap. If the year is extended to 2017, the changes in these three indicators remain stable. Taking it a step further, I sorted the cities in descending order of size and selected the top fifty large cities to calculate the real wage gap. I found that, except for 2013 and 2014, the real wage gap calculated using the top fifty large cities was smaller than the real wage gap calculated using all cities. It can be inferred that in most years, the real wage gap in large cities is smaller than the real wage gap in small cities. It can be seen that moving to a large city does not necessarily mean that the real wage gap will widen, so there is another factor besides wages and rents that is driving the increasing trend of sorting across cities.

Summary of stylized facts
This section presents four key facts about migration, housing costs, and changes in income inequality. These facts suggest that, as the economy continues to grow, workers are migrating to live in cities other than their household registration locations, and the proportion of high-skilled migrants is increasing. Migration is characterized by an increasing trend in spatial sorting; living in a large city earns higher wages but also requires paying higher rents, and the wage gap between high and low skills is smaller in large cities than in small ones. The proportion of high-skilled workers in small cities has increased much more than in large cities. Wage growth is higher for all skilled labor in high-skill cities, and income inequality measured by the real wage gap between migrants with different skills is narrowing. A question naturally arises: Since moving to a large city does not necessarily mean that the real wage gap will widen, what factor other than wages and rents is driving the growing trend of worker sorting across cities?
Much of the literature on migration, such as Dudwick (2011), Mourmouras and Rangazas (2013), Xia and Lu (2015), and Liu and Wei (2019) [65][66][67][68], makes a similar point: the availability of amenities is an important factor that drives labor migration. How do urban amenities affect the location choice of the labor force? Do changes in urban amenities alter the spatial sorting patterns of workers? To clearly describe the mechanisms by which urban amenities affect spatial sorting and to quantify the impact of urban amenities in this process, this paper required causal estimates of labor migration elasticities and the specific characteristics of cities. The impact of changes in the number of high- and low-skilled workers on wages, rents, and amenities depends on the elasticities of the local housing supply, local labor demand, and amenity supply, which I examine in detail. Furthermore, using the utility microfoundation of workers' city choices, migration elasticities can be mapped onto utility functions, and the estimated parameters can be used to quantify the welfare effects of changes in wages, rents, and amenities. To measure how these supply elasticities and demand elasticities interact and ultimately lead to equilibrium outcomes, I used structural models to explore these questions in depth via techniques such as conditional logit estimation, generalized method of moments (GMM) estimation, and counterfactual simulation.

Urban spatial equilibrium model
A spatial equilibrium model is presented in this section. The setup of the model follows the main idea of Diamond (2016) [1] and adds to it the feature of migration costs. The model assumes that labor preferences, urban productivity, and urban housing supply are heterogeneous. Local productivity and amenities are set to respond endogenously to the skill set of local workers. This section details the following settings: labor demand, housing supply, labor supply, amenity supply, and how these items together determine spatial equilibrium across cities.

Labor demand
In this paper, subscript $j$ denotes a city and subscript $d$ denotes a firm. Each city $j$ has many homogeneous firms in year $t$. These firms use high-skilled labor $H_{djt}$, low-skilled labor $L_{djt}$, and capital $K_{djt}$ to produce a homogeneous tradable good. The form of the production function is as follows:
Total labor $N_{djt}$ and capital $K_{djt}$ enter the production function in Cobb-Douglas form. The total amount of labor employed by each firm, $N_{djt}$, combines high-skilled labor $H_{djt}$ and low-skilled labor $L_{djt}$ as imperfect substitutes. The elasticity of labor substitution is $1/(1-\rho)$, and the constant parameter $\rho$ does not change over time.
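The production function itself is not reproduced in the text above. A form consistent with the description (Cobb-Douglas in labor and capital, with a CES aggregate of the two skill groups), written along the lines of Diamond (2016) [1], would be the following sketch. The labor share $\alpha$ is a notational assumption that does not appear in the text, and the placement of the skill-specific productivity terms $y^H_{jt}$ and $y^L_{jt}$ inside the CES aggregate is likewise assumed.

```latex
\begin{align}
Y_{djt} &= N_{djt}^{\alpha} \, K_{djt}^{1-\alpha}, \\
N_{djt} &= \left( y^{L}_{jt} L_{djt}^{\rho} + y^{H}_{jt} H_{djt}^{\rho} \right)^{1/\rho}.
\end{align}
```

With this aggregator, the elasticity of substitution between high- and low-skilled labor is $1/(1-\rho)$, which matches the statement in the text.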
The differences in urban production functions are reflected in the heterogeneity of urban productivity. The productivity of high and low skills in each city is measured by $y^H_{jt}$ and $y^L_{jt}$, respectively. Eqs (5) and (6) relate these productivities to the local skill mix through $f_H(H_{jt}, L_{jt})$ and $f_L(H_{jt}, L_{jt})$ and to the exogenous productivity components $\varepsilon^H_{jt}$ and $\varepsilon^L_{jt}$.
Assume that there are a large number of firms and no barriers to entry, so the labor market is perfectly competitive and the wage paid by firms equals the marginal product of labor. Assume also that the capital market is frictionless, the supply of capital is perfectly elastic, and the price of capital is the same across cities, denoted as $k_t$. The demand for labor and capital by each firm can be written as follows:

PLOS ONE
The production function of firms has constant returns to scale, and all firms use the same production technology, so firm-level labor demand can be translated directly into city-level aggregate labor demand. Substituting the equilibrium capital level, the logarithm of labor demand at the city level can be written as follows:
The above equations show that labor supply affects wages via imperfect substitution of high- and low-skilled labor within firms (controlled by $\rho$) and via changes in urban productivity (controlled by $f_H(H_{jt}, L_{jt})$ and $f_L(H_{jt}, L_{jt})$). In estimating these equations, the only way to distinguish between the effects of endogenous productivity and imperfect labor substitution on wages is to parameterize $f_H(H_{jt}, L_{jt})$ and $f_L(H_{jt}, L_{jt})$ through a strong assumption. Instead of imposing parametric constraints, the labor demand equation can be written as an unknown function of the employment levels $(H_{jt}, L_{jt})$ and exogenous productivity $(\varepsilon^H_{jt}, \varepsilon^L_{jt})$:
where $g_H(H_{jt}, L_{jt})$ and $g_L(H_{jt}, L_{jt})$ represent the combined effects of imperfect labor substitution and endogenous productivity. Using log-linearized total labor demand to estimate these functions, the equations can be rewritten as follows:

Urban labor supply
Let the subscript $i$ denote workers in each household, and assume that each worker chooses to live in the city that offers the most attractive combination of wages, local good prices, and amenities. The wage of high-skilled labor differs from that of low-skilled labor in each city. The wage earned by a worker with education level $edu$ who resides in city $j$ in year $t$ and who inelastically provides one unit of labor is recorded as $W^{edu}_{jt}$. The worker consumes the local good $M$ and the national tradable good $O$; the price of the local good is denoted as $R_{jt}$, and the price of the tradable good is denoted as $P_t$. The worker also derives utility from the city's amenities $A_{jt}$. The worker has Cobb-Douglas preferences over local and tradable goods and maximizes utility subject to a budget constraint:
The relative preference of workers for local and tradable goods is controlled by $z$, where $0 \le z_i \le 1$. The optimal utility of a worker can be represented by the indirect utility function of living in city $j$. If a worker resides in city $j$ in year $t$, his or her utility $V_{ijt}$ is:
The prices of tradable goods are measured in 2015 prices using the CPI. From the worker's optimal utility function, his or her local good demand $HD_{ijt}$ can be derived as follows:
Workers' preferences for local non-market amenities are heterogeneous. This paper follows Diamond's (2016) [1] definition of amenity, which refers to all characteristics that can affect the attractiveness of a city, other than local wages and prices. This includes local social security programs, urban infrastructure, public services, and some natural conditions, such as rainfall. In this paper, the vector $x^A_{jt}$ represents the exogenous amenities of city $j$ in year $t$; it does not respond to the endogenous variables in the model. Workers assign a single-index valuation $a_{jt}$ to the urban amenity bundle.
The key feature of $a_{jt}$ is that it responds endogenously to the shares of high- and low-skilled labor in a city.
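The Cobb-Douglas consumption problem described above has a well-known closed-form solution: the worker spends a fixed share of the wage on the local good. The following Python sketch illustrates this under the assumption that the consumption part of utility takes the log form share·ln(M) + (1 − share)·ln(O); the function names and all numerical values are hypothetical and are not taken from the paper's data.

```python
import math

def optimal_bundle(wage, local_price, tradable_price, local_share):
    """Utility-maximizing quantities of the local and tradable good
    under Cobb-Douglas preferences (fixed expenditure shares)."""
    local_qty = local_share * wage / local_price
    tradable_qty = (1.0 - local_share) * wage / tradable_price
    return local_qty, tradable_qty

def consumption_utility(local_qty, tradable_qty, local_share):
    """Assumed log Cobb-Douglas utility from consumption (amenities omitted)."""
    return (local_share * math.log(local_qty)
            + (1.0 - local_share) * math.log(tradable_qty))

# Hypothetical worker: monthly wage 5000, local price 40, tradable price 1.
wage, rent, price, share = 5000.0, 40.0, 1.0, 0.63
m_star, o_star = optimal_bundle(wage, rent, price, share)
u_star = consumption_utility(m_star, o_star, share)

# Any other bundle that exhausts the same budget yields lower utility.
m_alt = m_star * 1.1
o_alt = (wage - rent * m_alt) / price
u_alt = consumption_utility(m_alt, o_alt, share)
```

In this sketch, local good demand is `local_share * wage / local_price`, so demand rises with wages and falls with local prices, which is the property the housing supply section relies on.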
Function $s_i(A_{jt})$ maps the urban amenity vector $A_{jt}$ to the utility value of a worker. The value that worker $i$ places on amenities $A_{jt}$ is:
$\beta^{prov}_i$ and $\beta^{region}_i$ measure the utility value to worker $i$ of living in a city in the province to which his or her household registration belongs and in a city in the region to which his or her household registration belongs, respectively.
According to the "National Standard Citizen ID Card Number of the People's Republic of China (GB11643-1999)" and the "Administrative Region Code of the People's Republic of China (GB/T 2260-2007)", the regions to which cities belong are classified according to the first two digits of the national ID number. Taiwan, Hong Kong, and Macau are not included, and the specific correspondence is as follows: North (1): Beijing (11), Tianjin (12), Hebei (13), Shanxi (14), and Inner Mongolia (15); Northeast (2): Liaoning (21), Jilin (22), and Heilongjiang (23); …
The worker-specific preference parameters are all functions of the worker's demographic grouping $z_i$. $z_i$ is a 2 × 1 dummy variable vector, which represents the skill level of the worker and whether the worker is a cross-provincial migrant. The coefficients ($\beta^x$, $\beta^a$, $\beta^{prov}$, $\beta^{region}$, and $\beta^{\sigma}$) are all 1 × 2 vectors that measure the utility value of urban characteristics for a given demographic group. $x^{prov}_j$ is a 1 × 30 binary vector whose $k$-th element takes the value of 1 if the city in which worker $i$ lives belongs to province $k$. Similarly, $x^{region}_j$ is a 1 × 7 binary vector whose $m$-th element takes the value of 1 if the city in which worker $i$ lives belongs to region $m$. $prov_i$ is a 30 × 1 binary vector whose elements take the value of 1 if worker $i$'s household registration belongs to the corresponding province. $reg_i$ is a 7 × 1 binary vector whose elements take the value of 1 if worker $i$'s household registration place belongs to the corresponding region. Each worker also has an idiosyncratic preference for each city, measured by $\varepsilon_{ijt}$, which follows a type-I extreme value distribution. The variance of workers' idiosyncratic preferences for each city varies across demographic groups.
I normalized the utility function by dividing the utility of each worker by $\beta^{\sigma} z_i$. The indirect utility of worker $i$ in city $j$ is:
For a given city, differences in preferences across workers of the same demographic group $z$ are caused by workers' household registration provinces and regions ($prov_i$, $reg_i$) as well as their idiosyncratic preferences for the city, $\varepsilon_{ijt}$. I defined the utility component of city $j$ that is common to all type-$z$ workers as $d^z_{jt}$:
The indirect utility function can be rewritten as follows:
This setup is consistent with the conditional logit model (McFadden, 1973) [4]. The difference in the total number of type-$z$ workers across cities represents the difference in the average utility that these workers derive from the cities. The expected total population of city $j$ is equal to the probability of each worker living in this city, summed over the entire population. The probability that worker $i$ will choose to live in city $j$ is:
Therefore, the total number of high- and low-skilled workers in city $j$ is:
where $C^H_t$ and $C^L_t$ represent the set of high- and low-skilled workers in the country, respectively.
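The mapping from utilities to city populations described above can be sketched in a few lines of Python. This is a minimal illustration of the standard conditional logit formula under type-I extreme value errors, not the paper's estimation code: the function names are hypothetical, the three cities and four workers are invented, and `home_bonus` stands in for the household registration province and region terms.

```python
import math

def choice_probabilities(mean_utility, home_bonus):
    """Conditional logit probabilities for one worker.

    mean_utility: city-level utility common to the worker's type.
    home_bonus:   worker-specific utility from living in the home
                  province or region (zero elsewhere).
    """
    v = [d + b for d, b in zip(mean_utility, home_bonus)]
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(x - v_max) for x in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def expected_population(mean_utility, home_bonuses):
    """Expected number of workers in each city: the sum of individual
    choice probabilities over all workers of the given type."""
    population = [0.0] * len(mean_utility)
    for bonus in home_bonuses:
        probs = choice_probabilities(mean_utility, bonus)
        for j, p in enumerate(probs):
            population[j] += p
    return population

# Hypothetical example: three cities, four workers of one skill type.
mean_utility = [0.5, 0.0, -0.5]
home_bonuses = [
    [0.3, 0.0, 0.0],  # registered in city 0's province
    [0.0, 0.3, 0.0],
    [0.0, 0.0, 0.3],
    [0.0, 0.0, 0.0],  # no home advantage in any of the three cities
]
population = expected_population(mean_utility, home_bonuses)
```

Because the probabilities of each worker sum to one, the expected populations sum to the number of workers, so the first estimation step can recover mean utilities from observed population shares.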

Housing supply
The local good prices R jt are determined when the housing market is in equilibrium. The local price level represents the prices of local housing and local composite goods, such as groceries and local services. The price of local composite goods is also affected by local housing prices.
The inputs used to build housing include construction materials and land. In each city, a developer is the representative of local landowners. The developer is a price taker and sells homogeneous housing at marginal production costs.
Local construction costs $CC_{jt}$ and local land costs $LC_{jt}$ are mapped to the marginal cost of building a house by the function $MC(CC_{jt}, LC_{jt})$. There is no uncertainty, and the price is equal to the present value of rent in steady-state equilibrium. Local rents can be written as follows:
where $\iota_t$ is the interest rate. Houses are owned by absentee landowners, who rent them to local residents. Land cost $LC_{jt}$ is a function of the aggregate demand for local goods. Eq (22) shows that households increase their demand for local goods when wages rise or local good prices fall. A large number of migrants also increases the demand for housing. Parameterizing the logarithmic housing supply equation, I obtain:
where $HD_{jt}$ is the total demand for local goods in city $j$ in year $t$. The elasticity of rent with respect to local good demand varies across cities and is measured by $\gamma_j$:
$x^{geo}_j$ measures the share of land within 30 km of each urban center that is undevelopable due to slopes exceeding 25% or to inland water bodies such as wetlands, lakes, and rivers. $\gamma^{geo}$ measures how changes in $\exp(x^{geo}_j)$ affect the inverse elasticity of housing supply $\gamma_j$. Local land use regulations have a similar effect via policies that restrict housing development. Smaller values of the land use regulation indicator imply more permissive policies toward real estate development. $\gamma^{regulation}$ measures how changes in $\exp(x^{regulation}_j)$ affect the inverse elasticity of housing supply $\gamma_j$. $\gamma$ measures the base inverse elasticity of housing supply when a city has no land use regulatory policies and no geographic constraints that limit housing development.
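To make the role of the city-specific inverse supply elasticity concrete, the following Python sketch assumes that it is additive in a base term and in the exponentiated geography and regulation measures, as in Diamond (2016) [1]. The additive form, the function names, and every parameter value below are assumptions for illustration, not estimates from this paper.

```python
import math

def inverse_supply_elasticity(gamma_base, gamma_geo, gamma_reg, x_geo, x_reg):
    """Assumed form: base term plus contributions from exp(x_geo)
    (undevelopable land share) and exp(x_reg) (land use regulation)."""
    return gamma_base + gamma_geo * math.exp(x_geo) + gamma_reg * math.exp(x_reg)

def log_rent_change(gamma_j, dlog_local_demand):
    """Change in log rent implied by a change in log local good demand,
    holding interest rates and construction costs fixed."""
    return gamma_j * dlog_local_demand

# Two hypothetical cities with the same regulation but different geography.
flat_city = inverse_supply_elasticity(0.5, 0.1, 0.1, x_geo=0.05, x_reg=0.2)
hilly_city = inverse_supply_elasticity(0.5, 0.1, 0.1, x_geo=0.60, x_reg=0.2)

# The same 10% demand increase raises rents more where land is scarce.
flat_rent_rise = log_rent_change(flat_city, 0.10)
hilly_rent_rise = log_rent_change(hilly_city, 0.10)
```

Under these assumptions, a city with more undevelopable land has a larger inverse elasticity and therefore a steeper rent response to an inflow of migrants.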

Amenity supply
In this paper, I used $x^A_{jt}$ to denote exogenous amenities and $a_{jt}$ to denote amenities that respond endogenously to the type of labor living in a city. I allowed the endogenous amenity index to be determined by the urban high-skill employment ratio $H_{jt}/L_{jt}$:
where $\gamma^a$ is the elasticity of amenity supply and $\varepsilon^a_{jt}$ is the exogenous component of the amenity index $a_{jt}$. All amenities in a city are represented by the vector $A_{jt}$:

Equilibrium
The equilibrium of the model is a set of wages, rents, and amenity levels $(w^{L*}_{t}, w^{H*}_{t}, r^{*}_{t}, a^{*}_{t})$ and populations $(H^{*}_{jt}, L^{*}_{jt})$ such that high-skilled labor demand equals high-skilled labor supply:
low-skilled labor demand equals low-skilled labor supply:
housing demand equals housing supply:
and endogenous amenity demand equals endogenous amenity supply:

Model estimation
In this section, I constructed the endogenous amenity index a jt and established the instrumental variables needed to solve the endogeneity problem.

Endogenous amenity index
A city's amenity index should ideally capture the full range of amenities that respond endogenously to the city's skill mix. To measure urban amenities as broadly and comprehensively as possible, I collected data on seventeen different amenities and classified them into seven categories: financial institutions, transportation infrastructure, education quality, job market, cultural heritage, natural environment, and health care. This paper used an autoencoder (AE) to extract a single (one-dimensional) amenity index $a_{jt}$ for each city. Some amenity categories have more data sources than others. Since dimensionality reduction of high-dimensional data puts more weight on the amenity categories with more data sources, I first created an amenity category index using the data within each category and then used all amenity category indices to create an overall amenity index, as detailed in Table 3.
Because principal component analysis (PCA) is equivalent to a single-layer neural network with a linear activation function [72], its learning capability is very limited. PCA is not ideal for the dimensionality reduction of complex data because, in reality, there are many nonlinear relationships between high-dimensional data features; linear projection is then no longer applicable, and nonlinear dimensionality reduction methods are required (Schölkopf et al., 1998) [73]. If the single-layer neural network is transformed into a multi-layer neural network, the linear activation function is replaced by a nonlinear activation function, and the constraint that the dimensions of the transformed data be uncorrelated is removed, then PCA is converted into an AE with a more powerful learning capability.

Bartik labor demand shock
When the explanatory variables were endogenous, I used Bartik instruments, which are commonly used in the literature, to obtain consistent estimates of their coefficients. Changes in industry productivity levels within each city are a component of changes in urban productivity (Bartik, 1991) [6]. Because high- and low-skilled labor have different industry compositions, changes in industry productivity will have different effects on a city's local high- and low-skill productivity. I measured exogenous local productivity changes through the interaction between cross-sectional differences in industry employment composition and national changes in high- and low-skill wages across industries. Accordingly, this paper defines the Bartik shock for high- and low-skilled labor as follows:
where $w^H_{ind,-j,t}$ and $w^L_{ind,-j,t}$ represent the log average wages of high- and low-skilled labor in industry $ind$ in year $t$, respectively, excluding the labor force in city $j$. $H_{ind,j,2005}$ and $L_{ind,j,2005}$ represent the number of high- and low-skilled workers employed in industry $ind$ in city $j$ in 2005, respectively. These Bartik labor demand shocks are part of a city's exogenous productivity changes over time. Specifically, the exogenous high- and low-skill productivity changes in Eqs (16) and (17) can be written as follows:
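As an illustration of how such a shock is typically assembled from the quantities defined above, the Python sketch below weights the national (leave-one-out) change in log industry wages by the city's 2005 industry employment shares for one skill group. The share weighting follows the usual Bartik construction and is an assumption here, since the defining equation is not reproduced in this text; the function name, industry names, and numbers are hypothetical.

```python
def bartik_shock(base_employment, wage_change):
    """Share-weighted labor demand shock for one city and one skill group.

    base_employment: {industry: workers of this skill in the city in 2005}
    wage_change:     {industry: change in the national log average wage of
                      this skill group, computed excluding the city itself}
    """
    total = sum(base_employment.values())
    return sum(
        (workers / total) * wage_change[industry]
        for industry, workers in base_employment.items()
    )

# Hypothetical city: 300 high-skilled workers in manufacturing and 100 in
# services in 2005; national log wages rose by 0.10 and 0.30 respectively.
employment_2005 = {"manufacturing": 300, "services": 100}
log_wage_change = {"manufacturing": 0.10, "services": 0.30}
shock = bartik_shock(employment_2005, log_wage_change)
```

Fixing the weights at their 2005 values and excluding the city's own workers from the national wage change is what keeps the shock plausibly unrelated to later local supply decisions.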

Labor demand
The amount of labor demanded is a function of local productivity and wages. Differencing a city's wage relative to its base-year level yields the following:
Substituting the Bartik labor demand shock into the labor demand equation, I obtained:
The direct effect of the Bartik shock is to shift the local labor demand curve, directly affecting local wages.

Housing supply
The change in the housing supply curve after 2005 is as follows:

Labor supply
The indirect utility of worker $i$ with demographic grouping $z_i$ in city $j$ is:
I used a two-step estimation method similar to that of Berry et al. (2004) [5] to estimate workers' preferences for cities. First, I used conditional logit regression to obtain maximum likelihood estimates of the average utility value $d^z_{jt}$ for each demographic group in each city every five years. The second step was to decompose the average utility values into workers' valuations of wages, rents, and amenities. Differencing the urban average utility estimates of workers in demographic group $z$ relative to the base-year level yields:
Define $\Delta x^z_{jt}$ as the change in unobservable exogenous amenities, expressed in utility terms, for demographic group $z$ in city $j$:
Substituting this into Eq (62), I obtained:

Amenity supply
Differencing the amenity supply relative to its 2005 level yields:
I estimated all parameters jointly using a two-step GMM method, and standard errors were clustered by city in all estimation equations. All equations contained five-year fixed effects to incorporate national changes over time.

Migration cost
Researchers have shown that migrants generally face two challenges in accessing urban amenities when choosing a city to settle in. On the one hand, the household registration threshold limits their access to urban resources (Lu, 2016) [74,75]. On the other hand, because urban resources were planned for a smaller population, as people migrate into a city, more and more permanent residents compete for these resources; this lowers per capita resource availability and leads to shortages of urban infrastructure and public services as well as congestion and other "urban diseases" (Lu, 2016) [75]. Therefore, instead of emphasizing the relocation cost directly related to distance or the cost of living directly related to the prices of local goods, the migration costs in this paper describe the situation in which migrants do not have full access to all urban amenities, both because of the household registration threshold and because the resident population exceeds the carrying capacity of urban infrastructure and public services; migrants therefore suffer some loss of utility. Following Tombe and Zhu (2019) [76], who modeled the migration costs arising from interprovincial and intersectoral labor mobility as utility costs, this paper also treats the inter-city migration cost as a utility cost. Now, I extend the base model. If migrants choose to live in city $j$, they can enjoy only some of the urban amenities due to limited access to them. I model this amenity distortion with the migration cost $\tau_{z,jt}$ as the core variable, that is, as a local tax rate levied on type-$z$ migrants living in city $j$, which affects the utility of the migrants:
where $z \in \{H, L\}$ and $\tau_{z,jt} \ge 0$.
The utility component $d^z_{jt}$ of living in city $j$ that is common to all type-$z$ workers becomes: for high-skilled labor:
and for low-skilled labor:
The indirect utility equation that takes migration costs into account is:
With this setting, the higher the migration cost $\tau_{z,jt}$, the lower the utility that migrants derive from urban amenities, and vice versa. When $\tau_{z,jt} = 0$, there is no status difference between migrants and the registered population in city $j$, and the planned urban amenities can satisfy the needs of all residents, so access to urban amenities is not restricted, and the local tax rate on migrants' utility is zero.
I followed previous studies [1,23,43,77] and considered the additional impact of housing prices on non-housing goods. Thus, I calculated the price of local goods in two cases. One treats housing as the only local good, and the other treats non-housing goods together with housing as local goods. The only indicator in the CMDS that belongs to the category of non-housing goods and provides an average monthly expenditure is local food. Therefore, in the second case, local good expenditures consist of both local housing expenditures and local food expenditures.

Local good expenditure share
I also selected the databases used in the calculation of local good expenditure share. The target year for calculating the local good expenditure share in this paper is 2015, and the database should be able to provide data on non-housing commodity expenditures. In addition to the CMDS, other possible options include the Chinese Household Income Project Survey (CHIP), the China Family Panel Studies (CFPS), and the China Household Finance Survey (CHFS). First, the sample sizes of these datasets are much smaller than the size of the CMDS; thus, the number of migrants in each of these datasets is too small for this paper. Second, each of the three datasets had other shortcomings that could not meet the needs of this study. CHIP was excluded because no survey was conducted in 2015, and the adjacent available year is 2013, but using 2013 data as a proxy would cause large errors that would affect the accuracy of the results. The CHFS was excluded because in 2015, it provided expenditure data on consumption expenditure, property expenditure, business expenditure, social security expenditure, and transfer expenditure; however, according to the definitions of each expenditure provided, its statistical scope differs significantly from the needs of this paper. The reason for excluding the CFPS is that no survey was conducted in 2015, and the number of available samples after screening was less than 2000 in 2014 and 2016. In contrast, the CMDS has the following advantages: A survey was conducted in 2015, the sample size was sufficient (about 190000 available samples), and the required data were provided (i.e., food expenditure, housing expenditure, and total expenditure were provided). After a comprehensive comparison, I used the CMDS to calculate the local good expenditure share.
To avoid outliers caused by economic fluctuations in a given year, I calculated the local good expenditure share for each year from 2013 to 2015. To assess whether these expenditure shares were due to different average prices faced by workers with different skills, I further controlled for skill level as well as the size of the city in which these households were located. City size was divided into five classes according to the Notice on Adjusting the Criteria for Classification of City Size issued by the State Council. Table 4(A) reports the local good expenditure share made up of housing expenditure only: the average housing expenditure share for high- and low-skilled workers in 2015 was 24.65% and 18.36%, respectively. Table 4(B) reports the local good expenditure share consisting of food and housing expenditures, in which case there is no significant difference in the local good expenditure share across skill levels. Combining the regression coefficients and the statistical averages, I set the local good expenditure share at 0.63.

Migration costs
I used the product of two parameters, $\lambda$ and $\mu_{z,jt}$, to represent the migration cost $\tau_{z,jt}$: $\tau_{z,jt} = \lambda \cdot \mu_{z,jt}$, where $\mu_{z,jt}$ denotes the migration costs across cities calculated in this paper with reference to methods in previous literature, and $\lambda$ denotes the coefficient that adjusts the calculation results according to the lower bound of utility that migrants can accept when making urban location choices.
In this paper, I used the urban household registration threshold index multiplied by the migrant-ratio index to measure the migration cost μ z,jt incurred by migrants due to a lack of local household registration and limited access to urban resources when they live in a different city. I used the household registration threshold index constructed by Zhang and Lu (2019) [78] to measure migrants' limited access to urban resources due to the household registration threshold. The original index included only 120 cities, so I calculated the average household registration threshold for each type of city according to city size and filled in the cities with missing data.
I used the migrant-ratio index constructed by Han and Lu (2018) [79] to measure the gap between the planned and actual number of users of urban infrastructure, social security, public services, and other resources: migrant ratio = (resident population - registered population) / registered population. To prevent negative migrant ratios from distorting the calculation of migration costs, the migrant ratios were normalized.
Previous literature has reached slightly different conclusions on how much productivity can be gained by eliminating labor market distortions, but the estimates are around 20%. Pan et al. (2018) [80] … [83] calculated the result as 21.6%. In this paper, I set this value to 20%, which means that eliminating labor market distortions can increase productivity by 20%. According to Tombe and Zhu (2019) [76], when land is not used as an input factor in the production process, the ratio between the labor productivity improvement and the welfare improvement driven by a reduction in domestic migration costs is 1:1.58; that is, if eliminating labor market distortions raises productivity by 20%, it can raise the level of labor utility by up to 31.6%. In other words, labor market distortions reduce not only the level of productivity but also the level of labor utility, which is only roughly 76% of what it would be in the ideal case without distortions. Based on this finding, I set the lower bound on the utility that migrants can accept in their city of residence to 75% of the utility level in the undistorted case. If the migration cost $\tau_{z,jt}$ is too high, the local tax rate on the utility received by a migrant living in city $j$ will be significantly higher. Once the level of utility available to the migrant falls below the lower bound, he or she will make a new choice: either apply for local household registration and join the registered population, or live in another city.
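The arithmetic behind the utility lower bound can be checked with a short Python sketch. It only restates the two figures quoted above (a 20% productivity gain and the 1:1.58 ratio attributed to Tombe and Zhu (2019) [76]); the function name is hypothetical, and the inversion 1 / (1 + gain) is my reading of how the distorted utility level relates to the undistorted one.

```python
def distorted_utility_share(productivity_gain, welfare_per_productivity):
    """Return the welfare gain from removing distortions and the implied
    ratio of distorted utility to undistorted utility."""
    welfare_gain = productivity_gain * welfare_per_productivity
    share = 1.0 / (1.0 + welfare_gain)
    return welfare_gain, share

# 20% productivity gain, welfare rises 1.58 times as much.
welfare_gain, share = distorted_utility_share(0.20, 1.58)
```

Under this reading, the welfare gain is 31.6% and utility under distortions is about 0.76 of the undistorted level, which is consistent with rounding the lower bound down to 75%.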
According to the above setting, the value of $\lambda$ was set to 0.012 by combining it with the migration costs $\mu_{z,jt}$ across cities.
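The construction of the migration cost from its components can be sketched as follows. The text does not state which normalization was applied to the migrant ratio, so min-max scaling to the unit interval is an assumption here, as are the function names, the three hypothetical cities, and their threshold index values; only λ = 0.012 comes from the text.

```python
def migrant_ratio(resident_population, registered_population):
    """(resident - registered) / registered; negative when a city loses people."""
    return (resident_population - registered_population) / registered_population

def min_max_normalize(values):
    """Assumed normalization: rescale to [0, 1] so no ratio is negative."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def migration_cost(lam, threshold_index, normalized_ratio):
    """tau = lambda * mu, with mu = threshold index * normalized migrant ratio."""
    return lam * threshold_index * normalized_ratio

# Three hypothetical cities (populations in thousands).
residents = [1500.0, 900.0, 1000.0]
registered = [1000.0, 1000.0, 1000.0]
threshold = [2.0, 0.5, 1.0]  # hypothetical household registration thresholds

ratios = [migrant_ratio(r, g) for r, g in zip(residents, registered)]
normalized = min_max_normalize(ratios)
tau = [migration_cost(0.012, h, n) for h, n in zip(threshold, normalized)]
```

In this sketch, the city with the highest registration threshold and the largest migrant inflow carries the highest migration cost, while the city that is losing population carries none.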

Labor supply
This paper presents parameter estimates for four specifications of the model to highlight the importance of endogenous amenities and productivity in influencing migration, wages, and rents from 2005 to 2015. I refer to Model (1) as the "standard model"; it assumes that local amenities and firms' local productivity levels are exogenous and that the elasticity of local labor demand is determined only by the labor substitution parameter $\rho$ between high and low skills. That is, it estimates labor demand Eqs (10) and (11), where $f_H(H_{jt}, L_{jt}) = 0$ and $f_L(H_{jt}, L_{jt}) = 0$. Households' local good expenditure share is not calibrated, to highlight how workers trade off wages against local prices when amenities are assumed to be exogenous. The estimation results of Model (1) are shown in Column 1 of Table 5. The results show that in Model (1), high-skilled workers prefer high wages and low rents, while low-skilled workers have a positive demand elasticity for rents, so under the same conditions, the real wages of low-skilled workers are lower than those of high-skilled workers. High- and low-skilled workers do not have the same trade-off between wages and rents, which implies a difference in their local good expenditure shares. If only housing expenditures are considered when calculating the local good expenditure share, then according to the setup of Model (1), high- and low-skilled workers are willing to spend about 12.6% and about 27.8% of their expenditures on local goods, respectively. The estimation results suggest that the difference in skill mix across cities arises because the local good expenditure share of low-skilled labor is more than twice as large as that of high-skilled labor. However, the estimated 12.6% expenditure share is rejected by the CMDS.
Table 4 shows that when only housing expenditures are considered, the local good expenditure share of high-skilled workers, which is also the lower bound of all local goods consumption, is about 24.7%; the large gap between the high-skill and low-skill local good expenditure shares estimated by Model (1) is also rejected by the CMDS. The main difference between the two calculations is that the high-skill local good expenditure share in the CMDS is higher than the low-skill share, while the high-skill share estimated by Model (1) is lower than the low-skill share. Model (2) adjusts the local good expenditure share to 0.63 based on Model (1) and estimates only the elasticity of labor migration with respect to wages. I call it the "restricted standard model." The estimation results are presented in Column 2 of Table 5. They show that the wage elasticity of high-skilled labor decreases to 48.2% of the wage elasticity in Model (1), while the rent elasticity increases to 2.42 times that in Model (1). That is, when the high-skill local good expenditure share is raised from the estimated 12.6% to the calibrated 63%, the utility gain from higher wages is nearly halved, while the utility reduction from higher rents becomes about 1.4 times greater. Large cities can provide higher wages, but they also have higher living costs. For high-skilled labor, the utility reduction from choosing a large city with a higher cost of living is therefore greater than the utility increase from higher wages. This echoes the inference from the stylized facts that a factor other than wages is driving the strengthening of spatial sorting. This factor should be positively correlated with local prices affected by the Bartik shock and housing supply. Changes in amenities could explain this puzzle.
I tested the over-identifying restrictions. For Model (1) and Model (2) (p-values of 0.3233 and 0.2059, respectively), the test fails to reject the null hypothesis that all instrumental variables are exogenous. This further supports my inclusion of the endogenous amenity variable in the model.
The third column of Table 5 presents the estimation results of Model (3). Model (3) keeps the local good expenditure share at 0.63, adds urban endogenous amenities, and relaxes the constant elasticity of substitution functional form of labor demand, allowing for a more flexible model of labor demand. I refer to this as the "full model." According to the estimates, both high- and low-skilled workers prefer higher wages, lower rents, and higher levels of amenities. However, preferences also differ between high- and low-skilled workers, with the key differences being how they value wages and amenities and how they value real wages relative to amenities. The migration elasticity with respect to wages for high-skilled workers (1.891) is larger than that for low-skilled workers (0.306), indicating that high-skilled workers are more sensitive to changes in wages. Similarly, high-skilled workers are also more sensitive to the level of amenities (0.092 > 0.064), possibly because they are more capable than low-skilled workers of breaking through migration barriers and choosing a new city to settle in, and are therefore more attentive to the key characteristics of the new city. Model (3) also passes the test of over-identifying restrictions; that is, the null hypothesis that the instrumental variables are exogenous is not rejected. The endogenous amenity index added to the model captures previously overlooked variables.
The fourth column of Table 5 presents the estimation results of Model (4), which removes the assumption of 0.63 for the local good expenditure share and attempts to identify this parameter from the census data. The estimation results of Model (4) are noisier due to the correlation between housing rent and amenities, but the main conclusion that a high-skilled worker prefers higher wages, lower rents, and higher levels of amenities still holds. Since only monthly housing expenditures are available in the census data, using them as the local good expenditure yields a share of about 18.8% for high-skilled labor and 17.3% for low-skilled labor. This result is broadly in line with the estimate of 23% in the literature (Wang and Li, 2015) [41] and with my calculations based on CMDS data.
The bottom half of Table 5 reports the heterogeneity of preferences for labor migration across provinces. Overall, compared to the base regression results, interprovincial migrants face lower real wages, while high-skill interprovincial migrants have higher amenity elasticity than intra-provincial migrants; that is, high-skill interprovincial migrants are more concerned with the level of amenities in their city of residence.
Table 6(B) presents the estimates of inverse housing supply elasticity. The overall level of my estimates is determined by the base inverse housing supply elasticity term $\gamma$. The mean value of the base inverse elasticity of housing supply is 0.548, with a standard deviation of 0.019. The estimates of the base inverse elasticity of housing supply do not differ significantly across the four model specifications, which is not surprising since they all share the same housing supply model. Consistent with the work of Fan et al. (2015) [84] and Lu et al. (2015) [85], the results in Table 6(B) show that rent increases are higher in cities with stricter land regulation and in cities with higher land unavailability. In other words, housing supply elasticity is lower in areas with higher levels of land use regulation and in areas with a lower share of land available for real estate development.

Labor demand
The parameter estimates of the local labor demand curve are presented in Table 6(C). The estimated ρ for Model (1) is 0.947, which implies that the labor elasticity of substitution is 18.87. The estimated ρ for Model (2) is 0.914. The parameter estimates are consistent with those of Zhao and Yuan (2017) [86]. A high but finite elasticity of substitution indicates that high- and low-skilled labor are imperfect substitutes (Card, 2009) [28], and that technological progress favors high-skilled labor.
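The step from ρ to the elasticity of substitution can be checked with a short calculation. The sketch below assumes the usual CES relationship, elasticity = 1 / (1 − ρ), which reproduces the 18.87 reported for Model (1). The function name is illustrative, and the Model (2) value is implied by the same formula rather than reported in the text.

```python
def elasticity_of_substitution(rho: float) -> float:
    """Elasticity of substitution implied by a CES parameter rho (rho < 1)."""
    if rho >= 1:
        raise ValueError("rho must be below 1 for a finite elasticity")
    return 1.0 / (1.0 - rho)


# rho estimates quoted in the text for Models (1) and (2)
sigma_model_1 = elasticity_of_substitution(0.947)
sigma_model_2 = elasticity_of_substitution(0.914)

print(round(sigma_model_1, 2))  # 18.87, matching the value reported for Model (1)
print(round(sigma_model_2, 2))  # 11.63, implied by the formula (not reported in the text)
```

Under this formula, a ρ closer to 1 means the two skill groups are closer to perfect substitutes, so Model (1) implies more substitutability than Model (2).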
Models (3) and (4) estimate more flexible labor demand curves [56] and [57]. Based on the results in Column 3 of Table 6(C), I can first reject the assumption that the elasticity of high-skilled labor demand with respect to high-skill wages is equal to the elasticity of low-skilled labor demand with respect to low-skill wages. In the standard CES production function commonly used in the literature, these two elasticities are often assumed to be the same. Second, the sign of the elasticity of high-skill wages with respect to high-skill employment is negative but insignificant, suggesting a powerful knowledge spillover among high-skilled workers in addition to the competitive relationship. This force contributes to the productivity of all workers, thus increasing their wages in areas with high concentrations of high-skilled workers. The sign of the elasticity of low-skill wages with respect to low-skill employment is negative and significant at the 5% level, indicating that there is no strong knowledge spillover among low-skilled workers but rather a competitive relationship between them. The elasticity of high-skill wages with respect to low-skill employment is positive but insignificant, and the elasticity of low-skill wages with respect to high-skill employment is positive and significant at the 1% level, indicating that there is skill complementarity between high- and low-skilled labor and that the gains from skill complementarity are more significant for low-skilled labor. This is intuitive because high-skilled workers may be subject to low-skill shocks, resulting in a slight decline in high-skill productivity; this would affect high-skill earnings. In general, competitive relationships dominate within same-skilled labor, while complementary relationships dominate between high- and low-skilled labor. The elasticity estimates in Column 4 also support these findings.

Amenity supply
The elasticity of the amenity supply with respect to the high-skill employment ratio is presented in Table 6(D). A positive elasticity implies that an increase in the ratio of high-skill employment endogenously improves local amenities in cities. However, the relationship is not significant, indicating that growth in the level of amenities in large cities has not been able to match the spatial sorting trend of migrants in China, which suggests that there are differences in the endogenous mechanism of amenity supply between Chinese and American cities. Since high-skilled workers earn higher wages than low-skilled workers, they have greater abilities to choose locations with high levels of amenities, and their strong demand for amenities will also lead to an increase in the level of amenities in areas with a high concentration of high-skilled workers. This mechanism has been validated in a large amount of literature using U.S. data (Bayer et al., 2007; Guerrieri et al., 2013; Handbury, 2021) [15,18,19]. However, in China, this story is slightly different. First, one of the objectives of the Chinese government's poverty alleviation policies and transfer payment policies is to ensure basic living conditions of residents in less-developed areas and small cities; the implementation of such policies and measures has significantly improved the level of amenities in less-developed areas and small cities. Second, the construction of some amenities, such as urban infrastructure, is based on historical projections of population growth, and the low mobility of population in the early years made historical projections greatly underestimate the actual population growth in large cities. The supply of some amenities in large cities was not designed with a sufficient margin for an increasing population in the future. Consequently, the development of large cities is often accompanied by the problem of "urban diseases." Third, some amenities, such as social insurance and public services, rely mainly on local financial support, and limited fiscal revenues will also reduce the growth rate of the amenity level in large cities. Finally, China's household registration policy restricts migration, and the trend of migrants' spatial sorting is suppressed, which directly weakens the increase in amenity supply led by high-skilled workers' demand in large cities. This also results in many high-skilled workers failing to obtain household registration in large cities, so they will save a portion of their income earned in large cities to pay for their future living expenses in small cities, which indirectly reduces the demand for amenities among high-skilled workers in large cities. For these reasons, an increase in the amenity supply in China is not fully dominated by high-skilled workers' demand for amenities, and this resulted in a positive but insignificant regression coefficient.

Urban amenity and productivity
The exogenous productivity of local firms and the attractiveness of local amenities in each city can be inferred using the estimated results of the model parameters. Much of the literature has used hedonic techniques to estimate which cities provide the most desirable amenities. In this paper, I used a different method to infer the level of amenities in each city. Recalling Eq (64), the utility value of a city's amenities to the labor of a given demographic group is measured as a component of the common utility level of labor in each city, which is not controlled by local wages and rents. Therefore, $\mathrm{Amen}^{z}_{jt}$, the utility from amenities of type-z labor in city j in year t, can be written as follows: where $Z^{z}_{jt} = \left( \frac{1}{1-\tau_{z,jt}} \left( d^{Z}_{jt} Z_{jt} \right)^{-\tau_{z,jt}} \right)^{-1}$. Given the wage and rent levels in cities and the labor's preferences for wages and rents, it can be intuitively inferred that cities with higher-than-expected population levels for specific demographic groups have higher levels of amenities. Similarly, it is possible to analyze which cities have the highest and lowest productivity levels.
I regressed the model-predicted change in urban high-skill productivity on the model-predicted change in low-skill productivity. The regression coefficient of 0.025 for the per capita level of local high-skill productivity change and local low-skill productivity change was not significant, and the R² was low. This indicates that there is only a weak positive relationship between the two; that is, local high-skill and low-skill productivity changes differ greatly. Table 7 shows that the regression coefficient of 0.604 for the change in high-skill wages and the change in low-skill wages was significant at the 1% level, indicating a strong positive correlation between the two with an R² of 0.343, which means that the change in low-skill utility due to the change in wages explains about 34% of the change in high-skill utility due to the change in wages in the same city.
Note that simply comparing changes in local high-skill wages with changes in local low-skill wages is unlikely to reveal the weak positive relationship between local high-skill and low-skill productivity changes, because the movement along the local labor demand curve driven by migration masks the large differences in local productivity changes between skill groups.
The preferences of high- and low-skilled workers for urban amenities are relatively close. In general, the overall utility valuation of urban amenities by high-skilled labor is positively correlated with the utility valuation of the same urban amenities by low-skilled labor. According to the results presented in Table 7, the change in the utility value of high-skill amenities and the change in the utility value of low-skill amenities across cities are strongly positively correlated, regardless of whether the amenities are endogenous or exogenous. The difference mainly lies in the magnitude of the coefficients. For every 1% increase in the utility of endogenous amenities for low-skilled labor, the utility of endogenous amenities for high-skilled labor increased by 1.698%; for every 1% increase in the utility of exogenous amenities for low-skilled labor, the utility of exogenous amenities for high-skilled labor increased by 0.210%. This implies that high-skilled labor will gain more utility from changes in endogenous amenities than low-skilled labor, while high-skilled labor will gain less utility from changes in exogenous amenities than low-skilled labor. This result confirms, from a utility perspective, that endogenous amenities are an important force driving the spatial sorting of high-skilled labor. The R² results show that the change in low-skill utility due to endogenous amenity changes in the cities explains 35.3% of the change in high-skill utility for the same cities' amenities; the change in low-skill utility due to exogenous amenity changes explains 35.6% of the change in high-skill utility for the same cities' amenities.
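The Table 7 comparisons above read as one-regressor regressions of a high-skill change on the matching low-skill change, with the slope giving the co-movement and the R² the share of variation explained. The sketch below shows how such a slope and R² are computed. It assumes a plain OLS fit with an intercept and no weights, which the text does not confirm, and the input numbers are made up for illustration only.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the OLS fit y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary across observations")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single regressor, R^2 equals the squared correlation of x and y.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical city-level changes (NOT the paper's data).
d_low = [1.0, 2.0, 3.0, 4.0]   # change for low-skilled labor, by city
d_high = [2.0, 4.0, 5.0, 4.0]  # change for high-skilled labor, by city

intercept, slope, r2 = simple_ols(d_low, d_high)
print(intercept, slope, r2)
```

In this reading, a slope above 1 (as with the 1.698 for endogenous amenities) means the high-skill change moves more than one-for-one with the low-skill change, while a slope below 1 (as with the 0.210 for exogenous amenities) means it moves less.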
Migration costs weaken labor mobility and reduce the spatial sorting trend of migrants. If migration costs increase, migration will be more difficult and the number of migrants will decrease, but the problem of "urban diseases" in large cities due to excessive resident populations will be alleviated to some extent. If migration costs decrease, according to Eqs (66) and (67), a decrease in $\tau_{z,jt}$ leads to a decrease in $Z^{z}_{jt}$ and an increase in $d^{z\prime}_{jt}$. After substituting specific values into Eq (71), it was found that a decrease in $\tau_{z,jt}$ leads to an increase in $\mathrm{Amen}^{z}_{jt}$, and simultaneously, the number of migrants will increase; that is to say, more high- and low-skilled workers will be able to obtain household registration in large cities and, thus, have higher levels of amenities. However, if the population exceeds a city's carrying capacity, the resident population will compete more fiercely for urban resources. Like high-skilled workers, low-skilled workers also prefer higher levels of amenities, but cities with higher levels of amenities tend to have higher migration costs, and high-skilled workers have a greater ability to break through migration barriers than low-skilled workers, so migration costs have a relatively greater impact on the migration of low-skilled workers. A fraction of workers who forgo being registered in a large city due to high migration costs tend to make a new location choice and choose to live in a small city, while others tend to choose to work in a large city when they are young and relocate to a small city in the future.

Impact of registered population on location choice
The distribution of the registered population affects workers' location choice through three channels. First, the proportion of migrants to the resident population is relatively small in the majority of cities in China, and the registered population is the main component of the resident population. The larger the registered population, the larger the city size tends to be. According to Eqs (66) and (67), the larger the $Z_{jt}$, the larger the $Z^{z}_{jt}$ and the smaller the $d^{z\prime}_{jt}$, and the fewer indirect utilities migrants can have, which means that the registered population affects migrants' indirect utilities through the city size channel and, thus, has an impact on workers' location choices. Second, the distribution of the registered population has an impact on mobility costs: According to the migrant-ratio formula in Section 6.2, the higher the proportion of the registered population to the resident population, the lower the proportion of migrants to the resident population in the city, and the lower the migrant ratio, the lower the migration costs $\mu_{z,jt}$. This means that the registered population influences workers' location choices via the channel of competition intensity among migrants. Third, the household registration threshold index of Zhang and Lu (2019) [78] used in this paper was quantified by city size hierarchy. Since the registered population is the main component of the resident population in the majority of cities in China, generally speaking, the larger the registered population, the larger the city size tends to be, and the larger the value of the household threshold index, the higher the difficulty of moving in and settling down. That is, the registered population has an impact on workers' location choices via the urban settlement threshold channel.

Determinants of urban high-skill employment ratio changes
I used reduced-form regression between exogenous productivity changes estimated from the model and high-skill employment ratios to assess the role of local productivity changes in driving local migration patterns. The regression equation is as follows: According to the figures in Column 1 of Table 8, changes in high-skill exogenous productivity strongly predict increases in high-skill employment ratios, while changes in low-skill exogenous productivity strongly predict decreases in high-skill employment ratios. Furthermore, the R² of this regression suggests that 42% of the changes in the urban high-skill employment ratio can be explained by changes in local productivity.
I evaluated the predictive effect of model-inferred exogenous amenity changes $\Delta x^{z}_{jt}$ on high-skill employment ratio changes for comparison. According to the figures in Column 2 of Table 8, exogenous amenity changes negatively predict high-skill employment ratio changes, but the explanatory power is not very high, with an R² of only 0.102. I then combined exogenous amenity changes and exogenous productivity changes in the same regression; the results are presented in Column 3 of Table 8. Similarly, exogenous productivity changes strongly predict high-skill employment ratio changes. Compared to the regression using only productivity changes, the R² of the regression including exogenous amenity changes increased by only 0.014. Thus, it can be seen that local productivity changes are the main driver of urban high-skill employment ratio changes.
The next question is whether endogenous amenity changes are a key channel through which local productivity changes lead to changes in the high-skill employment ratio. The relationship between local real wage changes and the high-skill employment ratio should be examined first. Since changes in local productivity exert a significant influence on changes in the high-skill employment ratio, changes in local real wages should be a main independent variable explaining changes in the high-skill employment ratio. Local real wages are defined as the wages net of local good prices: The figures in Column 4 of Table 8 show a weak positive correlation between the increases in high-skill real wages and the changes in the high-skill employment ratio. The results show that real wages can still explain the changes in the high-skill employment ratio to some extent, but real wages are not the main driver of the increase in the spatial sorting trend. Those high-skilled workers who increasingly choose to live in cities with lower real wages even have to compensate for the lower real wages through urban amenities. Therefore, these reduced-form regression results are remarkably consistent with the stylized facts as well as the structural model estimates discussed earlier.
The results of estimating the impact of local productivity changes on real wages are presented in Column 5 of Table 8, and it is easy to see that an increase in high-skill productivity leads to an increase in high-skill real wages. High-skilled workers, who are paid high wages due to their high productivity, migrate to target cities, and housing prices are thus pushed up. If such migration is accompanied by an increase in urban amenity levels, the migration trend stops when higher rents offset the benefits of high wages and high levels of urban amenities. The still-increasing spatial sorting trend implies that for high-skilled labor, the current increase in local good prices has not fully offset the benefits of higher incomes from migration and increased amenities.
The results of a similar regression for low-skill real wages are presented in Column 6 of Table 8. An increase in low-skill productivity leads to an increase in low-skill real wages, while an increase in high-skill productivity also leads to an increase in low-skill real wages. Column 5 of Table 8 shows that an increase in low-skill productivity leads to a decrease in real wages for high-skilled workers. This suggests that complementarity between high and low skills has a positive effect on the increase in low-skill real wages. Although this has some negative effects on high-skilled labor, with a slight decrease in the high-skill real wages, the absolute value of the elasticity of low-skill real wages with respect to the high-skill productivity changes is larger than the absolute value of the elasticity of high-skill real wages with respect to the low-skill productivity changes. These results imply that the net effect of skill complementarity is positive, and that skill complementarity is more beneficial to low-skilled labor.
Through the structural equation regression in Table 6(C) and the reduced-form regression in Table 8, this paper provides a detailed description of the relationship between skill complementarity and knowledge spillover among workers with different skills. Summarizing the results, this paper finds that there is strong competition within same-skilled labor and that there is a strong knowledge spillover between high- and low-skilled labor.
The agglomeration of high-skilled labor raises the productivity of high-skilled labor as well as low-skilled labor, which is reflected in incomes, that is, higher wages for high- and low-skilled labor. Competitive relationships dominate within low-skilled labor, and there are few knowledge spillovers within them. The main spillover they receive comes from high-skilled labor; the net benefit of skill complementarity between high- and low-skilled labor is positive, but skill complementarity with low-skilled labor will negatively affect high-skill productivity to some extent, which will eventually be reflected in lower wages for high-skilled labor.
Complementarity between high- and low-skilled labor has a greater positive impact on low-skilled labor, significantly raising the wages of low-skilled labor.

Welfare implications and welfare inequality
From 2005 to 2015, the nominal wage gap and the real wage gap between high- and low-skilled labor gradually narrowed. However, changes in wage inequality do not necessarily coincide with changes in welfare inequality in the same direction. The additional welfare effects of local rents and amenities may increase or offset the welfare effects of wage changes. To measure how changes in urban wages, rents, and amenities affect welfare inequality, this paper performs a welfare decomposition. First, I hold local rents and amenities constant, assume that only urban wages change, and calculate the expected utility change for each worker from 2005 to 2015. The expected utility of worker i from the city where he or she prefers to live can be written as follows: If wages are adjusted to the level actually observed in 2015, then the expected utility of worker i, $E(\hat{U}^{w}_{i,2015})$, can be written as follows: If wages and rents are adjusted to the levels actually observed in 2015, the expected utility of worker i can be expressed as follows: If wages, rents, and endogenous amenities due to resorting of workers are adjusted to the levels actually observed in 2015, the expected utility of worker i can be expressed as follows: Column 1 of Table 9 shows that from 2005 to 2015, the increase in the welfare gap between high- and low-skilled labor due to wage changes was equivalent to an increase of 0.039 log points in the wage gap between high- and low-skilled labor in the country, which was contrary to the ten-year trend of a 0.141 log-point decrease in the wage gap between high- and low-skilled labor. Even if local amenities and rents do not change, the welfare inequality between high- and low-skilled labor still increases due to local wage changes.
Column 2 takes into account the additional effect of changes in local rents, showing that the change in welfare inequality between high- and low-skilled labor due to changes in wages and rents over ten years was equivalent to a 0.095 log-point increase in the wage gap between high- and low-skilled labor. The effect of wages and rents on welfare results in a large increase in welfare inequality. This is because while the rent gap between high- and low-skilled labor widens, and rents are higher in cities that offer desirable wages for high-skilled labor, the ratio of rents to wages for high-skilled labor is lower than that for low-skilled labor, so it is less stressful for high-skilled labor to pay the rents. Column 3 adds the changes in endogenous amenities caused by changes in the high-skilled employment ratio to the changes in wages and rents. I measured the impact of amenity changes on welfare inequality driven solely by labor resorting, fixing the national high-skilled labor share at 2005 levels. The ten-year change in welfare inequality between high- and low-skilled labor caused by changes in wages, rents, and endogenous amenities driven by labor resorting was equivalent to a 0.087 log-point increase in the wage gap between high- and low-skilled labor. Welfare inequality increased by 123% compared to the case in which only the impact of wages on welfare was considered, and welfare inequality decreased by 8.4% compared to the case in which the impacts of wages and rents on welfare were considered. The results suggest that improvements in endogenous amenities contribute to reducing the welfare inequality gap between high- and low-skilled labor.
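The sequential logic of this decomposition (update wages first, then rents, then endogenous amenities, and record the expected-utility change at each step) can be sketched in code. The sketch below is illustrative only and is not the paper's expressions: it assumes logit-type idiosyncratic location shocks, so that expected utility is a log-sum-exp over city mean utilities, it ignores migration costs, and it converts utility changes to log-wage units by dividing by the wage coefficient. All function names, preference parameters, and city values are hypothetical.

```python
import math


def expected_utility(wages, rents, amenities, beta_w, beta_r, beta_a):
    """Log-sum-exp of city mean utilities (logit inclusive value, constant dropped)."""
    deltas = [beta_w * w - beta_r * r + beta_a * a
              for w, r, a in zip(wages, rents, amenities)]
    top = max(deltas)  # subtract the maximum for numerical stability
    return top + math.log(sum(math.exp(d - top) for d in deltas))


def welfare_steps(base, end, prefs):
    """Expected-utility change in log-wage units as wages, then rents, then
    amenities are moved from base-year to end-year values."""
    u0 = expected_utility(base["w"], base["r"], base["a"], **prefs)
    scenarios = {
        "wages": (end["w"], base["r"], base["a"]),
        "wages+rents": (end["w"], end["r"], base["a"]),
        "wages+rents+amenities": (end["w"], end["r"], end["a"]),
    }
    return {name: (expected_utility(w, r, a, **prefs) - u0) / prefs["beta_w"]
            for name, (w, r, a) in scenarios.items()}


# Hypothetical two-city inputs in logs; none of these numbers come from the paper.
base = {"w": [1.0, 1.2], "r": [0.5, 0.9], "a": [0.0, 0.3]}
end = {"w": [1.1, 1.5], "r": [0.6, 1.2], "a": [0.0, 0.5]}

high = welfare_steps(base, end, {"beta_w": 2.0, "beta_r": 1.0, "beta_a": 1.5})
low = welfare_steps(base, end, {"beta_w": 3.0, "beta_r": 2.0, "beta_a": 0.5})

# Change in the high/low welfare gap at each step, in log-wage-gap equivalents.
gap_change = {name: high[name] - low[name] for name in high}
for name, value in gap_change.items():
    print(name, round(value, 3))
```

A useful sanity check on this construction is that a uniform rise of 0.1 log points in wages across all cities raises the wage-only welfare measure by exactly 0.1, since the log-sum-exp shifts by the wage coefficient times 0.1.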
Suppose that the migration costs that distort the labor market are eliminated, so that migrants have full access to all resources (including urban amenities in their resident cities) and the utility loss of workers who leave their places of residence to move across cities is reduced to zero. Under such an assumption, I once again measured how changes in urban wages, rents, and amenities from 2005 to 2015 impacted social welfare inequality. Table 10 shows that when migration costs are removed, the expected utility changes driven by wages and the expected utility changes driven by wages and rents over ten years were basically the same as when migration costs were present; however, the changes in welfare inequality caused by changes in wages, rents, and endogenous amenities changed significantly. When migration costs were eliminated, the change in welfare inequality between high- and low-skilled labor caused by changes in wages, rents, and endogenous amenities driven by labor resorting over a decade was equivalent to a reduction of 0.145 log points in the wage gap between high- and low-skilled labor. The magnitude of this change is 25% larger than the ten-year reduction in the real wage gap between high- and low-skilled labor observed in the data, and 3% larger than the ten-year reduction in the nominal wage gap observed in the data. Welfare inequality decreased by 453% compared to the case where only the impact of wages on welfare was taken into account and migration costs were eliminated. Welfare inequality decreased by 251% compared to the case where the impact of wages and rents on welfare was considered and migration costs were eliminated. Welfare inequality decreased by 267% compared to the case where the impact of wages, rents, and endogenous amenities on welfare was considered, and migration costs were taken into account.
The counterfactual results suggest that, on the one hand, an increase in the level of amenities facilitates the reduction of the welfare inequality gap between high- and low-skilled labor. On the other hand, if migration costs across cities are eliminated, migrants' access to urban amenities is no longer restricted, and low-skilled labor can enjoy the more desirable amenities and gain additional utility compared to high-skilled labor. Welfare then increases more for low-skilled labor, the effect of urban amenities in reducing the welfare inequality gap between high- and low-skilled labor is further enhanced, and the welfare inequality gap is reduced even more than the nominal wage gap.
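The percentage comparisons quoted in this section follow from the log-point figures given in the text, and the arithmetic can be reproduced directly. The sketch below uses only numbers stated above. The helper name is illustrative, and the implied real wage gap reduction is backed out from the "25% larger" statement rather than reported in the text.

```python
def pct_change(new: float, old: float) -> float:
    """Percentage change from old to new, relative to the magnitude of old."""
    return 100.0 * (new - old) / abs(old)


# Ten-year changes in the high/low welfare gap, in log-wage-gap equivalents.
wages_only = 0.039          # wages only, with migration costs
wages_rents = 0.095         # wages and rents, with migration costs
wages_rents_amen = 0.087    # wages, rents, endogenous amenities, with migration costs
no_migration_cost = -0.145  # wages, rents, endogenous amenities, no migration costs
nominal_gap_drop = 0.141    # observed ten-year reduction in the nominal wage gap

vs_wages_only = pct_change(wages_rents_amen, wages_only)         # about +123%
vs_wages_rents = pct_change(wages_rents_amen, wages_rents)       # about -8.4%
vs_with_costs = pct_change(no_migration_cost, wages_rents_amen)  # about -267%

# Size of the 0.145 reduction relative to the nominal wage gap reduction.
vs_nominal = 100.0 * (abs(no_migration_cost) / nominal_gap_drop - 1.0)  # about 3%

# Real wage gap reduction implied by "25% larger" (backed out, not reported).
implied_real_gap_drop = abs(no_migration_cost) / 1.25  # about 0.116 log points

print(round(vs_wages_only), round(vs_wages_rents, 1),
      round(vs_with_costs), round(vs_nominal), round(implied_real_gap_drop, 3))
```

The 453% and 251% figures cannot be checked this way because the corresponding Table 10 entries are not quoted in the text.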

Conclusion
From 2005 to 2015, differences in high- and low-skilled migrant labor's location choice were caused by differences in the spatial distribution of the productivity of such labor. By using a structural spatial equilibrium model that estimated local labor demand, housing supply, labor supply, and amenity supply, I found that local productivity changes led to labor resorting across cities through several channels, and I quantified these effects. Estimates suggest that cities that are disproportionately productive for high-skilled labor attract a larger proportion of high-skilled labor. The rising share of high-skilled labor in these cities leads to higher local productivity, which, in turn, drives up wages for all workers and improves the level of local amenities. A combination of desirable wages and improved amenities has led to a large influx of migrants, pushing up local rents. During this process, the wages of low-skilled labor grew faster, and the real wage gap between differently skilled workers gradually narrowed; simultaneously, the welfare gap between differently skilled workers expanded. Although improvements in the level of amenities can reduce the welfare inequality gap to a certain extent and make it more conducive for low-skilled workers to live in their target cities, the attractiveness of amenities to low-skilled workers is offset by higher rents. High-skilled workers are more capable of paying higher rents; thus, they are more sensitive to the level of urban amenities. Also, migration costs limit access to local amenities for low-skilled labor, allowing high-skilled labor to derive additional utility from more desirable amenities.
If migration costs were eliminated, the reduction in welfare inequality between high- and low-skilled labor due to changes in wages, rents, and endogenous amenities would be 25% greater than the reduction in the real wage gap between high- and low-skilled labor, increasing the benefits of low-skilled labor relative to high-skilled labor.

Supporting information
S1 Appendix.